Abstract. We describe a system for meta-analysis where a wiki stores
numerical data in a simple format and a web service performs the numerical
computation. We initially apply the system to multiple meta-analyses of
structural neuroimaging data. The described system allows for mass
meta-analysis, e.g., meta-analysis across multiple brain regions and
multiple mental disorders.

1 Introduction 

The scientific process aggregates a large number of scientific results into a com- 
mon scientific consensus. Meta-analysis performs the aggregation by statistical 
analysis of numerical values presented across scientific papers. Collaborative sys- 
tems such as wikis may easily aggregate text and values from multiple sources. 
However, so far they have had limited ability to apply numerical analysis as 
required, e.g., by meta-analysis. 

Researchers have discussed the advantages and disadvantages of the tools
for conducting systematic reviews, ranging from "paper and pencil" over
spreadsheets to RevMan and web-based specialized applications [10]: setup
cost, versatility, ability to manage data, etc. In 2009 they concluded that
"no single data-extraction method is best for all systematic reviews in all
circumstances". For example, RevMan and Archie of the Cochrane Library
provide an elaborate system for keeping track of and analyzing textual and
numerical data in meta-analyses, but the system could not import information
from electronic databases [10]. Our original meta-analyses [4, 5] relied on
Microsoft Excel spreadsheets that were later distributed on public web sites.
Compared to an ordinary spreadsheet, a wiki solution provides data entry
provenance and collaborative data entry with immediate update. Shareable
folders on cloud-based storage systems would help collaboration on
spreadsheets, but yield no provenance. Online services, such as the
spreadsheet of Google Docs, may lack meta-analytic plotting facilities.
Web-based specialized applications for systematic reviews may have a high
setup cost.

We have previously explored a simple online meta-analysis system, a "fielded
wiki", in connection with personality genetics [8]. As implemented
specifically for this scientific area, the web service lacks generality for
other types of meta-analytic data. Furthermore, the system relied on PubMed
or Brede Wiki to represent bibliographic information.

Following Ward Cunningham's quote "What's the simplest thing that could
possibly work?" we present a simple system that allows for mass meta-analysis
of numerical data presented as comma-separated values (CSV) in a standard
MediaWiki-based wiki, the Brede Wiki: http://neuro.imm.dtu.dk/wiki/.

2 Data and data representation 

We use the MediaWiki-based Brede Wiki to represent the data [1]. For our
neuroimaging data each data record usually consists of three values (the
number of subjects and the mean and standard deviation of their measurements).
The individual study typically compares two such data records, e.g., from a
patient and a control group. We also record labels for the data record, e.g.,
the bibliographic information, as well as extra subject information about the
two groups, such as age, gender and clinical characteristics, so that the
total number of data items for each study may be seven or more. Each
meta-analysis will usually determine what extra relevant information should
be included, and it may differ between studies, e.g., a Y-BOCS value is
typically relevant only for obsessive-compulsive disorder patients. The
functional neuroimaging area has the CogPO and Cognitive Atlas ontologies
enabling researchers to describe the topic of an experiment, but these
efforts do not directly apply to our data. One CSV line carries the
information for each study.
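
As an illustration of the one-line-per-study convention, the following sketch parses a small CSV page into study records. The column names and all numbers are hypothetical placeholders chosen for the example; they are not data from the wiki, and the actual pages may use other headers.

```python
import csv
import io

# Hypothetical CSV page content: one header line, then one line per study.
# Column names and values are made up for illustration only.
CSV_PAGE = """author,year,patients n,patients mean,patients sd,controls n,controls mean,controls sd
Study A,2001,20,0.70,0.10,22,0.65,0.12
Study B,2005,15,0.62,0.15,15,0.66,0.11
"""

# Columns whose names end in one of these suffixes hold numerical values.
NUMERIC_SUFFIXES = (" n", " mean", " sd")


def read_studies(text):
    """Return a list of dicts, one per study, with numeric fields as floats."""
    studies = []
    for row in csv.DictReader(io.StringIO(text)):
        study = {}
        for key, value in row.items():
            if key.endswith(NUMERIC_SUFFIXES):
                study[key] = float(value)
            else:
                study[key] = value  # label such as author or year
        studies.append(study)
    return studies


studies = read_studies(CSV_PAGE)
```

Each resulting record then holds the two groups (three values each) plus the labels, which corresponds to the "seven or more" data items per study mentioned above.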

Separate wiki pages, rather than uploaded files, store the CSV data, so the
MediaWiki template functionality can transclude the CSV data on other wiki
pages. By convention, pages with CSV information have the ".csv" extension
as part of the title so external scripts can recognize them as special pages,
and the wiki pages have no wiki markup.

MediaWiki templates may generate links for download, editing and
meta-analysis of the data. Presently, no controlled vocabulary beyond the
template fields describes the columns in the CSV. To generate an appropriate
content-type (text/csv) a bridging web script functions as a proxy, so a
download of the CSV page can spawn a client-side spreadsheet program.
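
The idea of the bridging proxy can be sketched as two small helpers: one that builds the URL of the unrendered page text, and one that builds the response headers that make a browser treat the reply as CSV. This is only a sketch under assumptions: it relies on MediaWiki's `action=raw` request parameter, while the `index.php` location, the example host and the function names are hypothetical and not taken from the actual proxy script.

```python
from urllib.parse import urlencode


def raw_page_url(index_url, title):
    """URL that asks MediaWiki for the unrendered text of a page."""
    return index_url + "?" + urlencode({"title": title, "action": "raw"})


def csv_response_headers(title):
    """HTTP headers announcing the proxied page as a CSV download."""
    filename = title.replace(" ", "_")
    if not filename.endswith(".csv"):
        filename += ".csv"
    return [
        ("Content-Type", "text/csv; charset=utf-8"),
        ("Content-Disposition", 'attachment; filename="%s"' % filename),
    ]


# Hypothetical wiki location and page title, for illustration only.
url = raw_page_url("http://example.org/w/index.php", "Some database.csv")
headers = csv_response_headers("Some database.csv")
```

A proxy would fetch `url`, pass the body through unchanged, and send it with `headers`.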

A few MediaWiki extensions can format CSV information: SimpleTable and
TableData. Figure 1 shows the transclusion of CSV data with a modified
version of the SimpleTable extension. The Brede Wiki uses the standard
template system for recording structured bibliographic data about the
publication and to annotate the CSV information, see Figure 2.

Fig. 1. Screenshot from the wiki showing CSV data transcluded on a page.

The bulk of the data currently presented in the wiki comes from the large
mass meta-analysis of volumetric studies on major depressive disorder
reporting over 50 separate meta-analyses for individual brain regions [4].
Further data comes from mass meta-analyses across multiple brain regions on
bipolar disorder [5] and first-episode schizophrenia [6], a meta-analysis on
longitudinal development in schizophrenia [7] as well as data from individual
original studies on obsessive-compulsive disorder.

Apart from neuroimaging studies the Brede Wiki also records data from
meta-analyses from a few other studies outside neuroimaging [2], allowing us
to test the generality of the framework. The data is distributed under ODbL.

{{Metaanalysis csv begin}}
{{Metaanalysis csv
| title = Major Depressive Disorder Neuroimaging Database - Pituitary, total
| topic1 = Pituitary
| topic2 = Major depressive
| topic3 = MaND
}}
{{Metaanalysis csv
| title = Obsessive-compulsive disorder Neuroimaging Database - Pituitary
| topic1 = Pituitary
| topic2 = Obsessive-compuls
| topic3 = ObND
}}
{{Metaanalysis csv end}}

Fig. 2. Template to annotate the CSV data and define the links to the
meta-analysis.



3 Web script and meta-analysis 

The web script for meta-analysis reads the CSV information, identifies the
required columns for meta-analysis, performs the statistical computations and
makes meta-analytic plots, the so-called forest and funnel plots, in the SVG
format, see Figure 3. From either the title information or a PubMed
identifier the script generates back-links from the generated page to pages
on the wiki. The script may also export the computed results as JSON or CSV.
Furthermore, it may generate a small R script that sets up the data in
variables and uses the meta library for meta-analysis.
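
The R export could look roughly like the output of the sketch below, which writes the study data into R vectors and calls `metacont` from the R `meta` package. The exact script produced by the web service is not shown here, so the layout, the dictionary keys (`label`, `n1`, `mean1`, ...) and the function names are assumptions made for illustration.

```python
def r_numeric_vector(values):
    """Format numbers as an R vector literal, e.g. c(20, 15)."""
    return "c(" + ", ".join(str(v) for v in values) + ")"


def r_string_vector(values):
    """Format labels as an R character vector, replacing double quotes."""
    quoted = ['"%s"' % str(v).replace('"', "'") for v in values]
    return "c(" + ", ".join(quoted) + ")"


def r_script(studies):
    """Build an R script (as a string) for a list of study dicts.

    Each dict uses the hypothetical keys label, n1, mean1, sd1 (first
    group) and n2, mean2, sd2 (second group).
    """
    def column(key):
        return [study[key] for study in studies]

    lines = [
        "library(meta)",
        "studlab <- " + r_string_vector(column("label")),
        "n.e <- " + r_numeric_vector(column("n1")),
        "mean.e <- " + r_numeric_vector(column("mean1")),
        "sd.e <- " + r_numeric_vector(column("sd1")),
        "n.c <- " + r_numeric_vector(column("n2")),
        "mean.c <- " + r_numeric_vector(column("mean2")),
        "sd.c <- " + r_numeric_vector(column("sd2")),
        "m <- metacont(n.e, mean.e, sd.e, n.c, mean.c, sd.c,"
        ' studlab = studlab, sm = "SMD")',
        "print(summary(m))",
        "forest(m)",
    ]
    return "\n".join(lines) + "\n"


# Made-up example data.
example = [
    {"label": "Study A", "n1": 20, "mean1": 0.7, "sd1": 0.1,
     "n2": 22, "mean2": 0.65, "sd2": 0.12},
    {"label": "Study B", "n1": 15, "mean1": 0.62, "sd1": 0.15,
     "n2": 15, "mean2": 0.66, "sd2": 0.11},
]
script = r_script(example)
```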

The web script attempts to guess the separator used on the CSV page and
also tries to match the elements of the column header, e.g., the strings
"control n", "controls number", "number of controls", etc. match for the
number of control subjects. With no matches the user needs to explicitly
specify the relevant columns via URL parameters, which in turn a wiki
template can set up.
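
A minimal sketch of such guessing is given below. It is not the heuristic of the actual web script: the candidate separators, the regular expressions and the internal column names (`controls_n`, `patients_n`) are assumptions, chosen so that the three example strings above all map to the number of control subjects.

```python
import re

CANDIDATE_SEPARATORS = [",", ";", "\t", "|"]


def guess_separator(text):
    """Pick the candidate separator occurring most often in the header line."""
    header = text.splitlines()[0]
    return max(CANDIDATE_SEPARATORS, key=header.count)


# Hypothetical patterns mapping header variants to an internal column name.
COLUMN_PATTERNS = {
    "controls_n": re.compile(r"^(controls? (n|number)|number of controls)$"),
    "patients_n": re.compile(r"^(patients? (n|number)|number of patients)$"),
}


def match_columns(header_fields):
    """Map internal column names to the index of the first matching field."""
    mapping = {}
    for index, field in enumerate(header_fields):
        normalized = " ".join(field.lower().split())  # case and whitespace
        for name, pattern in COLUMN_PATTERNS.items():
            if name not in mapping and pattern.match(normalized):
                mapping[name] = index
    return mapping


text = "Author;Control N;Number of patients\nStudy A;22;20\n"
separator = guess_separator(text)
columns = match_columns(text.splitlines()[0].split(separator))
```

Columns missing from the returned mapping are the ones a user would have to specify explicitly.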

Standard meta-analysis computes an effect size from each result in a paper
and computes a combined meta-analytic effect size and its confidence
interval. Although methodological development continues, established
statistical approaches exist for ordinary meta-analysis [2]. Our system
implements computations on the standardized mean difference for continuous
variables and on the natural logarithm of the odds ratio for categorical
variables, with fixed and random effects methods using an inverse-variance
weighted model, following the approach in the Stata program. As an extra
option we provide meta-analysis on the natural logarithm of the variance
ratio [3], for comparison of the standard deviations between two groups of
subjects.
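
The computations can be sketched in a few lines of plain Python. The sketch makes several assumptions that the text does not settle: the standardized mean difference is taken as Cohen's d with a common large-sample variance approximation, the between-study variance of the random effects model uses the DerSimonian-Laird estimator, and the variance of the log variance ratio uses the normal-theory approximation 2/(n - 1) per group. The web service may differ in these details.

```python
import math


def smd(n1, mean1, sd1, n2, mean2, sd2):
    """Cohen's d and an approximate sampling variance (assumed variant)."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    d = (mean1 - mean2) / math.sqrt(pooled_var)
    var = (n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2))
    return d, var


def log_variance_ratio(n1, sd1, n2, sd2):
    """ln(sd1^2 / sd2^2) and its normal-theory variance approximation."""
    return 2 * math.log(sd1 / sd2), 2 / (n1 - 1) + 2 / (n2 - 1)


def pool(effects, variances, tau2=0.0):
    """Inverse-variance weighted mean and its standard error."""
    weights = [1 / (v + tau2) for v in variances]
    total = sum(weights)
    estimate = sum(w * y for w, y in zip(weights, effects)) / total
    return estimate, math.sqrt(1 / total)


def dersimonian_laird_tau2(effects, variances):
    """Between-study variance estimate (DerSimonian-Laird, assumed here)."""
    weights = [1 / v for v in variances]
    total = sum(weights)
    fixed, _ = pool(effects, variances)
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    c = total - sum(w ** 2 for w in weights) / total
    if c <= 0:  # happens with a single study
        return 0.0
    return max(0.0, (q - (len(effects) - 1)) / c)


def meta_analysis(effects, variances):
    """Fixed and random effects estimates with 95% confidence intervals."""
    tau2 = dersimonian_laird_tau2(effects, variances)
    result = {"tau2": tau2}
    for name, t2 in (("fixed", 0.0), ("random", tau2)):
        estimate, se = pool(effects, variances, t2)
        result[name] = {
            "estimate": estimate,
            "se": se,
            "ci": (estimate - 1.96 * se, estimate + 1.96 * se),
        }
    return result
```

For the log odds ratio the same `meta_analysis` function applies once each study supplies its log odds ratio and the corresponding variance.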



4 Results 



We have added 124 pages with CSV data, most of which contain data suitable
for meta-analysis. For individual analyses the reading, computation and
download finish within seconds. With multiple calls to the web script and
JSON output another script can plot multiple meta-analytic results together
as in Figure 4. Generating such a plot takes several minutes. For generating
the page shown in Figure 3 we need only the CSV data and the web script,
while the script that generated Figure 4 used information defined in
templates, CSV data and the web script with no further adaptation of
MediaWiki.
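
How a collecting script might combine several JSON results can be sketched as follows. The JSON field names (`title`, `effect`, `se`, `subjects`) and the numbers are hypothetical, since the output format of the web script is not specified here. The sketch assumes a two-sided normal test, under which a 0.05-significance line in a plot of uncertainty against effect size corresponds to a standard error of about |effect| / 1.96.

```python
import json
import math

# Hypothetical JSON documents, standing in for the output of two calls to
# the web script. Field names and values are made up for illustration.
RESPONSES = [
    '{"title": "Region A", "effect": -0.40, "se": 0.10, "subjects": 900}',
    '{"title": "Region B", "effect": 0.10, "se": 0.15, "subjects": 300}',
]


def two_sided_p(effect, se):
    """Two-sided p-value of effect / se under a standard normal distribution."""
    z = abs(effect) / se
    return math.erfc(z / math.sqrt(2))


def summarize(responses, alpha=0.05):
    """Collect one row per meta-analysis with its p-value and significance."""
    rows = []
    for text in responses:
        result = json.loads(text)
        p = two_sided_p(result["effect"], result["se"])
        rows.append({
            "title": result["title"],
            "effect": result["effect"],
            "se": result["se"],
            "subjects": result["subjects"],  # could set the dot size
            "p": p,
            "significant": p < alpha,
        })
    return rows


rows = summarize(RESPONSES)
```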

5 Discussion 

By using MediaWiki in our present system we exploit the template facility to
capture structured information, and free-form wikitext for annotation and
comment on the individual scientific papers, as in the semantic academic
annotation wikis AcaWiki and WikiPapers. It is also possible to use the pages
of the wiki as a simple means to keep track of the status of the papers
considered for the meta-analysis: potentially eligible, eligible, partially
entered and fully entered.

Why not Semantic MediaWiki? Semantic MediaWiki (SMW) may query text and
numerical data, though it has not had the ability to make complex
computations. The Semantic Result Formats extension includes average, sum,
product and count result formats enabling simple computations on a series of
numerical values, but these are insufficient for the kind of computations we
require. The data for meta-analysis form an n-ary data record (mean, standard
deviation, number of subjects, labels), so either individual SMW pages should
store each data record or we should invoke the n-ary functionality in the
Semantic Internal Objects SMW extension, SMW record or the
recently-introduced subobject SMW functionality. We have not investigated
whether these tools provide convenient means for representing our data. The
Brede Wiki can export its ontologies defined in MediaWiki templates to SKOS.
Our future research can consider RDFication of the CSV information through
the SCOVO format [9].

Fig. 3. Screenshot of web script showing the meta-analytic results with
forest and funnel plots.



We wrote the web service in Python, where NumPy makes vector computation
available and SciPy provides the statistical methods necessary for the
computation. In a future PHP implementation the script could more closely
integrate with the wiki as either a MediaWiki or an SMW extension.

A wiki built from standard components provides an inexpensive solution with
means to manage meta-analytic data in a collaborative environment. The
general framework allows not only the meta-analysis of neuroimaging-derived
data but has the potential for managing and analyzing data from many other
domains.

Fig. 4. Results from mass meta-analyses shown in a L'Abbé-like plot and
constructed by calling the web script multiple times. Each dot corresponds to
a meta-analysis. The plot shows uncertainty as a function of effect size,
with the size of each dot determined by the number of subjects. The line
indicates 0.05-significance.



